Welcome to Data Handling 2023!

  • Go to this page (or use the QR code): https://bit.ly/datahandling-2022
  • Use one row to respond to the questions in the column headers (see the first two rows for examples).

Background

‘Data Science’?

“This coupling of scientific discovery and practice involves the collection, management, processing, analysis, visualization, and interpretation of vast amounts of heterogeneous data associated with a diverse array of scientific, translational, and inter-disciplinary applications.”

University of Michigan ‘Data Science Initiative’, 2015

But, what about statistics?!

“Seemingly, statistics is being marginalized here; the implicit message is that statistics is a part of what goes on in data science but not a very big part. At the same time, many of the concrete descriptions of what the DSI will actually do will seem to statisticians to be bread-and-butter statistics. Statistics is apparently the word that dare not speak its name in connection with such an initiative!”

David Donoho (2015). 50 years of Data Science

What’s new about all this?

“All in all, I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.”

John Tukey (The Future of Data Analysis, 1962!)

Technological change

Relevance for modern economic research

Data science in economics skill set

Organisation of the Course

Our Team - At Your Service

  • Andrea Burro
  • Matthias Rösti
  • Aurélien Sallin

Introduction: Aurélien Sallin

  • 2022-today: Expert in Health Care Research, SWICA Health Organization, Winterthur
  • 2022-today: Post-Doc researcher and lecturer, HSG
  • 2018-2022: PhD in Economics and Finance, HSG

Previously:

Introduction: Aurélien Sallin

Research (SWICA?)

  • Using real-world data from claims to assess the effectiveness of health technology tools
  • Using (Causal) Machine Learning to evaluate the effect of health policies on doctors’ prescription behaviors
  • Financing models for mandatory health care in Switzerland

Other Research in Economics of Education

  • Misclassification rates for gifted students
  • Evaluation of Special Education programs

Course Structure

Course concept: lectures

  • Lectures (Thursday morning)
    • Background/Concepts
    • Illustration of concepts
    • Illustration of ‘hands-on’ approaches

Course concept: special lectures

  • 27.10.2022: Industry Insights
    • Ulrich Matter: Web Data
    • Aurélien Sallin: Text as Data
    • Michael Tüting: Images as Data

24.11.2022: Guest lecture: Economic Data Science, SNB

  • Dr. Matthias Gubler: Head of Economic Data Science, Swiss National Bank (SNB)
  • Dr. Helge Liebert: Economist, SNB

Course concept: exercises

  • Exercise sheets (handed out every other week)
    • Some conceptual questions
    • Hands-on exercises/tutorials in R
    • Detailed solution videos
    • The first exercise sheet (set up R/RStudio) is available on StudyNet/Canvas today

Course concept

  • Learning mode in this course: prepare with the reading, attend the lecture, recap key concepts in the lecture notes (self-study), work on the exercises, watch the solution video, come to the exercise session, repeat…

  • Strongly encouraged: (virtual) learning groups!

    • The biweekly exercises provide an opportunity for this.
    • Tackle the tricky exercises together!

Course concept: exercise sessions

  • In-class exercise sessions (bi-weekly evening sessions)
    • Discussion of exercises and additional input
    • Recap of concepts
    • Q&A, support
    • Time for more coding!

Part I: Data (Science) fundamentals

Date Topic
21.09.2023 Introduction: Big Data/Data Science, course overview
28.09.2023 Programming with R
28.09.2023 Exercises/Workshop 1: Tools, programming
05.10.2023 An introduction to data and data processing
12.10.2023 Data storage and data structures
12.10.2023 Exercises/Workshop 2: Data storage and data structures
19.10.2023 Web data, text, and images
26.10.2023 Data sources, data gathering, data import
26.10.2023 Exercises/Workshop 3: Web data, text, and images

Part II: Data gathering and preparation

Date Topic
16.11.2023 Data preparation and manipulation
23.11.2023 Basic statistics and data analysis with R
23.11.2023 Exercises/Workshop 4: Data gathering, data import
30.11.2023 Visualisation, dynamic documents

Part III: Analysis, visualisation, output

Date Topic
07.12.2023 Guest Lecture: Matteo Courthoud (Senior Economist and Data Scientist @Zalando)
07.12.2023 Exercises/Workshop 5: Data preparation and applied data analysis with R
14.12.2023 Guest Lecture: Florian Chatagny (Head of Data Science @Federal Finance Administration in Bern)
21.12.2023 Exercises/Workshop 6: Visualization, dynamic documents
21.12.2023 Summary, Wrap-Up, Q&A, Feedback
21.12.2023 Exam for Exchange Students

Core course resources

  • All information and materials (notes, slides, course sheet, syllabus, etc.) are available on StudyNet/Canvas.
  • Core materials will also be made available on Nuvolos.

Main textbooks

Further resources

Exam information

  • Central, written examination: digital, BYOD! There will be an instructional session by the head of the digital examinations team (date TBD).
  • Multiple choice questions.
  • A few open questions.
  • Theoretical concepts and practical applications in R (questions based on code examples).

Exam information II

  • We will release sample multiple choice questions via Quizzes on StudyNet/Canvas (same format and style as the exam questions).
  • Exchange students who need to take the exam before the central exam block:

And now this…

Q&A

References